Learning Heuristic Search via Imitation

نویسندگان

Mohak Bhardwaj

Sanjiban Choudhury

Sebastian Scherer

چکیده

Robotic motion planning problems are typically solved by constructing a search tree of valid maneuvers from a start to a goal configuration. Limited onboard computation and real-time planning constraints impose a limit on how large this search tree can grow. Heuristics play a crucial role in such situations by guiding the search towards potentially good directions and consequently minimizing search effort. Moreover, it must infer such directions in an efficient manner using only the information uncovered by the search up until that time. However, state of the art methods do not address the problem of computing a heuristic that explicitly minimizes search effort. In this paper, we do so by training a heuristic policy that maps the partial information from the search to decide which node of the search tree to expand. Unfortunately, naively training such policies leads to slow convergence and poor local minima. We present SAIL, an efficient algorithm that trains heuristic policies by imitating clairvoyant oracles oracles that have full information about the world and demonstrate decisions that minimize search effort. We leverage the fact that such oracles can be efficiently computed using dynamic programming and derive performance guarantees for the learnt heuristic. We validate the approach on a spectrum of environments which show that SAIL consistently outperforms state of the art algorithms. Our approach paves the way forward for learning heuristics that demonstrate an anytime nature finding feasible solutions quickly and incrementally refining it over time.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

HC-Search: A Learning Framework for Search-based Structured Prediction

Structured prediction is the problem of learning a function that maps structured inputs to structured outputs. Prototypical examples of structured prediction include part-ofspeech tagging and semantic segmentation of images. Inspired by the recent successes of search-based structured prediction, we introduce a new framework for structured prediction called HC-Search. Given a structured input, t...

متن کامل

Learning to Search via Self-Imitation

We study the problem of learning a good search policy. To do so, we propose the self-imitation learning setting, which builds upon imitation learning in two ways. First, self-imitation uses feedback provided by retrospective analysis of demonstrated search traces. Second, the policy can learn from its own decisions and mistakes without requiring repeated feedback from an external expert. Combin...

متن کامل

HC-Search: Learning Heuristics and Cost Functions for Structured Prediction

Structured prediction is the problem of learning a function from structured inputs to structured outputs. Inspired by the recent successes of search-based structured prediction, we introduce a new framework for structured prediction called HC-Search. Given a structured input, the framework uses a search procedure guided by a learned heuristic H to uncover high quality candidate outputs and then...

متن کامل

Learning Generalized Reactive Policies using Deep Neural Networks

We consider the problem of learning for planning, where knowledge acquired while planning is reused to plan faster in new problem instances. For robotic tasks, among others, plan execution can be captured as a sequence of visual images. For such domains, we propose to use deep neural networks in learning for planning, based on learning a reactive policy that imitates execution traces produced b...

متن کامل

Imitation Learning (of Robots) Synonyms Learning/programming From/by Demonstration, Apprenticeship Learning

Imitation is the ability to recognize and reproduce others’ actions – By extension, imitation learning is a means of learning and developing new skills from observing these skills performed by another agent. Imitation learning as applied to robots is a technique to reduce the complexity of search spaces for learning. When observing either good or bad examples, one can reduce the search for a po...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2017

Learning Heuristic Search via Imitation

نویسندگان

چکیده

منابع مشابه

HC-Search: A Learning Framework for Search-based Structured Prediction

Learning to Search via Self-Imitation

HC-Search: Learning Heuristics and Cost Functions for Structured Prediction

Learning Generalized Reactive Policies using Deep Neural Networks

Imitation Learning (of Robots) Synonyms Learning/programming From/by Demonstration, Apprenticeship Learning

عنوان ژورنال:

اشتراک گذاری